Learning Semantic Representations for Nonterminals in Hierarchical Phrase-Based Translation
Abstract
In hierarchical phrase-based translation, the coarse-grained nonterminal X may generate inappropriate translations because it carries too little information to constrain phrasal substitution. In this paper we propose a framework to refine nonterminals in hierarchical translation rules with real-valued semantic representations. The semantic representations are learned via a weighted mean value method and a minimum distance method, using phrase vector representations obtained from a large-scale monolingual corpus. Based on the learned semantic vectors, we build a semantic nonterminal refinement model to measure semantic similarities between phrasal substitutions and nonterminal Xs in translation rules. Experimental results on Chinese-English translation show that the proposed model significantly improves translation quality on NIST test sets.
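The two learning methods and the similarity measure named in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting scheme, the distance metric, or the similarity function, so the choices here are assumptions: occurrence counts as weights, Euclidean distance with a medoid-style selection for the minimum distance method, and cosine similarity for scoring. All function names and the toy vectors are hypothetical.

```python
import math

def weighted_mean(vectors, weights):
    """Weighted average of phrase vectors (weights assumed to be counts)."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minimum_distance(vectors):
    """Pick the phrase vector with the smallest summed distance to all
    others (one possible reading of a 'minimum distance method')."""
    return min(vectors, key=lambda c: sum(euclidean(c, v) for v in vectors))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy data: vectors of phrases that filled one nonterminal X in training,
# with hypothetical occurrence counts.
phrase_vectors = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
counts = [3, 1, 1]

x_mean = weighted_mean(phrase_vectors, counts)
x_medoid = minimum_distance(phrase_vectors)

# Score two candidate substitutions against the nonterminal's vector.
good_score = cosine([0.9, 0.1], x_mean)
bad_score = cosine([0.0, 1.0], x_mean)
```

In this toy setup a candidate phrase whose vector lies close to the nonterminal's learned vector receives a higher score than a distant one, which is the kind of signal a refinement model could add as a feature during decoding.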
Similar resources
Learning Bilingual Distributed Phrase Representations for Statistical Machine Translation
Following the idea of using distributed semantic representations to facilitate the computation of semantic similarity between translation equivalents, we propose a novel framework to learn bilingual distributed phrase representations for machine translation. We first induce vector representations for words in the source and target language respectively, in their own semantic space. These word v...
Generalizing Hierarchical Phrase-based Translation using Rules with Adjacent Nonterminals
Hierarchical phrase-based translation (Hiero, (Chiang, 2005)) provides an attractive framework within which both short- and long-distance reorderings can be addressed consistently and efficiently. However, Hiero is generally implemented with a constraint preventing the creation of rules with adjacent nonterminals, because such rules introduce computational and modeling challenges. We introduce meth...
Towards Bidirectional Hierarchical Representations for Attention-based Neural Machine Translation
This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder. To maximize the predictive likelihood of target words, a weighted variant of an attention mechanism is used to balance the attentive information between lexical...
CCG augmented hierarchical phrase-based machine translation
We present a method to incorporate target-language syntax in the form of Combinatory Categorial Grammar in the Hierarchical Phrase-Based MT system. We adopt the approach followed by Syntax Augmented Machine Translation (SAMT) to attach syntactic categories to nonterminals in hierarchical rules, but instead of using constituent grammar, we take advantage of the rich syntactic information and fle...
NTT statistical machine translation for IWSLT 2006
We present the NTT translation system used in the evaluation campaign of the “International Workshop on Spoken Language Translation (IWSLT).” The system consists of two primary components: a hierarchical phrase-based statistical machine translation system and a reranking system. The former is conceptualized as a synchronous-CFG in which phrases are hierarchically combined using non...